Annotating Information Structure in a Corpus of Spoken Danish
Abstract
This paper presents the work done to annotate a corpus of spoken Danish with information structure tags, and describes a preliminary study in which the corpus was used to investigate the relation between focus and intra-clausal pauses. The study indicates that the pauses that do fall within the focus domain tend to precede property-expressing words by which the object in focus is distinguished from
Similar resources
Information Structure and Pauses in a Corpus of Spoken Danish
This paper describes a study in which a corpus of spoken Danish annotated with focus and topic tags was used to investigate the relation between information structure and pauses. The results show that intra-clausal pauses in the focus domain tend to precede those words that express the property or semantic type whereby the object in focus is distinguished from other ones in the domain.
Discourse Annotation In The Monroe Corpus
We describe a method for annotating spoken dialog corpora using both automatic and manual annotation. Our semi-automated method for corpus development results in a corpus combining rich semantics, discourse information and reference annotation, and allows us to explore issues relating these.
Stress, pauses, pronominal types and pronominal functions in Danish spoken data
In this paper we present a study of the relation between types of third-person singular neuter pronoun and their functions in Danish spoken data where stress information is marked so that personal and demonstrative occurrences of the pronouns can be distinguished. This study confirms that there are language-specific differences in the way various types of pronoun are used to refer to abstract...
Coherent Back-Channel Feedback Tagging of In-Car Spoken Dialogue Corpus
This paper describes the design of a back-channel feedback corpus and its evaluation, aiming at realizing in-car spoken dialogue systems with high responsiveness. We constructed our corpus by annotating the existing in-car spoken dialogue data with back-channel feedback timing information in an off-line environment. Our corpus can be practically used in developing dialogue systems which can prov...
N.b.: A graphical user interface for annotating spoken dialogue
Corpora of transcribed and annotated dialogues are very useful for developing and evaluating the coverage of algorithms for discourse generation and interpretation and dialogue modelling. On the other hand, there is no agreement on the choice of units and conventions for annotating discourse constituents, and the annotation process can be difficult and prone to inconsistencies. This paper prese...